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Transcription factor cistromes are highly cell-type specific. Chromatin accessibility, histone modifications, and nucleosome 
occupancy have all been found to play a role in defining these binding locations. Here, we show that hormone-induced 
DNase I hypersensitivity changes (ADHS) are highly predictive of androgen receptor (AR) and estrogen receptor 1 (ESR1) 
binding in prostate cancer and breast cancer cells, respectively. While chromatin structure prior to receptor binding and 
nucleosome occupancy after binding are strikingly different for ESR1 and AR, ADHS is highly predictive for both. AR 
binding is associated with changes in both local nucleosome occupancy and DNase I hypersensitivity. In contrast, while 
global ESR1 binding is unrelated to changes in nucleosome occupancy, DNase I hypersensitivity dynamics are also predictive 
of the ESR1 cistrome. These findings suggest that AR and ESR1 have distinct modes of interaction with chromatin and that 
DNase I hypersensitivity dynamics provides a general approach for predicting cell-type specific cistromes. 



[Supplemental material is available for this article.] 

In eukaryotes, transcription is regulated in a cell-type and condi- 
tion-specific manner through the association of transcription 
factors with chromatin. The genome-wide binding sites of tran- 
scription factors, or the transcription factor cistromes, are influ- 
enced by the active protein levels of the transcription factors, 
chromatin structure, and DNA sequence. The nucleosome is the 
fundamental unit of chromatin structure and has been thought to 
compete with transcription factors for occupancy at thermody- 
namically favorable genomic loci. By comparing nucleosome oc- 
cupancy maps generated from nucleosome-resolution H3K4me2 
ChlP-seq, we found that nucleosome occupancy changes can be 
predictive of transcription factor cistromes. In particular, the 
binding of androgen receptor (AR) in prostate cancer LNCaP cells 
leads to an increased occupancy of nucleosomes flanking the AR 
binding site and decreased nucleosome occupancy in the position 
of the binding site itself (He et al. 2010). This approach also cor- 
rectly predicted the binding of two factors, POU2F1 and NKX3-1, 
which are part of the secondary cellular response to androgens (He 
et al. 2010). This phenomenon is not unique to the LNCaP AR 
system; it has also been observed with CDX2, HNF4A, and GATA6 
binding in intestinal differentiation (Verzi et al. 2010) and with 
GATA1 in hematopoiesis (Hu et al. 2011). 

DNase I hypersensitivity is an alternative measure of chro- 
matin accessibility (Wu 1980). DNase I hypersensitive sites (DHS), 
short regions of chromatin that are highly sensitive to cleavage by 
DNase I, typically occur in nucleosome free regions and frequently 
arise as a result of transcription factor binding. DNase I digestion 
followed by high-throughput sequencing (DNase-seq) has evolved 
into a powerful technique for identifying genome-wide DNase 
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hypersensitive sites (Ling et al. 2010; John et al. 2011; Siersbaek 
et al. 2011). Because transcription factor binding sites tend to be 
DNase I hypersensitive and DNase-seq does not require a factor- 
specific antibody, DNA sequence motif analysis on DHS data has 
been proposed as a method for discovering the binding sites of 
multiple transcription factors in a single experiment (Pique-Regi 
et al. 2011; Song et al. 2011). 

To analyze the effects of androgen receptor (AR) and estrogen 
receptor 1 (ESR1) binding on DHS, we conducted genome-wide 
DNase-seq in both unstimulated and hormone-stimulated condi- 
tions. Using a quantitative measurement of DHS changes (ADHS) 
between these conditions, we were able to predict the ESR1 and AR 
cistromes. Although they are related members of the steroid re- 
ceptor family, AR and ESR1 display distinct DHS profiles. Binding 
of both ESR1 and AR are frequently associated with significant 
increases in DHS signal upon hormone stimulation; however, ESR1 
sites show strong DHS prior to binding and AR sites do not. Fol- 
lowing hormone stimulation, FOXA1 binding sites that lacked AR 
or ESR1 binding are associated with a significant decrease in DHS. 
In MCF-7 cells, this change in DHS is linked not to a change in 
FOXA1 binding but rather to a decrease in the binding of the ESR1 
coactivator, NCOA3, supporting a model of physiologic squelch- 
ing. This study demonstrates that ADHS is a more effective and 
general approach to predict perturbation-induced transcription 
factor binding sites than either static DHS or nucleosome resolu- 
tion H3K4me2 ChlP-seq. 

Results 

Estrogen receptor binding in breast cancer cells is not 
associated with significant nucleosome depletion 

Based on our earlier work demonstrating the association between 
AR binding and nucleosome depletion (He et al. 2010), we carried 
out an H3K4me2 ChlP-seq experiment on MNase digested chro- 
matin in the MCF-7 breast cancer cell line comparing unstimulated 
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(Veh) cells with cells grown under conditions of estrogen stimula- 
tion (E2). Consistent with previous studies (Barski et al. 2007; He 
et al. 2010), the H3K4me2 sites in both samples were mainly located 
in intergenic and intronic regions and were also found to be en- 
riched in promoter regions (Fig. 1A). Over 64% of estrogen recep- 
tor 1 (ESR1) binding sites overlapped with regions enriched in 
H3K4me2 (estrogen-stimulated) (Fig. IB). We examined the distri- 
bution of H3K4me2 signals relative to the center of all ESR1 binding 
sites. Although in some cases ESR1 binds to regions depleted of 
H3K4me2 signal (Supplemental Fig. 1A), in both the vehicle and 
stimulated conditions the overall pattern shows a peak in the 
H3K4me2 signal that overlaps with the ESR1 binding sites (Fig. 1C). 

We systematically assessed ESR1 binding as a function of the 
nucleosome stabilization-destabilization (NSD) score, a measure of 
nucleosome occupancy changes established in previous studies 
(He et al. 2010). The fraction of ESR1 binding sites located in high 
NSD scoring regions was no greater than the fraction in regions 
with low NSD scores (Fig. ID). This pattern is significantly different 
from that observed in AR binding (Supplemental Fig. IB). In AR 
binding an H3K4me2 tag density peak at the AR binding site be- 
comes a trough after androgen stimulation, resulting in high NSD 
scoring regions being highly predictive of AR binding (He et al. 
2010). Whereas in MCF-7 the distributions of NSD scores at ER and 



non-ER sites are not significantly different (Supplemental Fig. 1C, 
P-value = 0.25), the distributions of NSD scores in LNCaP AR and 
non-AR sites are significantly different (Supplemental Fig. ID, 
P-value = 2.2 X 10" 16 ). 

In order to determine whether the differences in the behavior 
of AR in LNCaP cells and ESR1 in MCF7 cells were due to a differ- 
ence in the transcription factors or the cell lines, we analyzed 
H3K4me2 enrichment at AR, ESR1, and FOXA1 sites together (Fig. 
2A,B; Supplemental Fig. 2A,B). We included the winged helix 
transcription factor FOXA1 in the analysis as it acts as a "pioneer 
factor ,; in breast cancer cells and is required for ESR1 binding to 
a large proportion of its binding sites (Carroll et al. 2005; Lupien 
et al. 2008). The role of FOXA1 in AR action in prostate cancer cells 
is more complex, though a significant number of AR-bound sites 
are also bound by FOXA1 (Lupien et al. 2008; Wang et al. 2011). 
Consistent with our previous findings (He et al. 2010), sites bound 
by FOXA1 alone in either LNCaP or MCF7 cells show a pair of 
stimulus-independent peaks that flank a trough directly over the 
FOXA1 binding site (Fig. 2A,B, right panels). 

When we examined the H3K4me2 signal at sites bound by AR 
or ESR1 that lacked FOXA1, we observed very different patterns. In 
LNCaP cells, AR binding sites that do not bind FOXA1 had a broad 
peak of H3K4me2 prior to hormone stimulation that resolved into 
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Figure 1 . Characteristics of H3K4me2 ChlP-seq in MCF-7 cells. (A) Location of H3K4me2 ChlP-enriched peaks relative to gene annotations in unstimulated 
(Veh) and estrogen-stimulated (E2) conditions. (B) Venn diagram of ESR1 binding loci in relation to H3K4me2-enriched regions. (C) Distribution of H3K4me2 
ChlP-seq signal at non-promoter (>1 kb from TSSs) ESR1 binding sites under unstimulated and estrogen-stimulated conditions. (D) The fraction of ESR1 
binding sites in paired nucleosome bins sorted in descending order by NSD score (stimulated vs. unstimulated). Paired nucleosome regions are ranked by the 
NSD score that represents the differences in the H3K4me2 tag counts before and after estrogen treatment. These ranked regions are grouped into bins of 500 
to calculate the proportion of real binding sites as a function of rank. (K-axis) Fraction of the regions in each bin that overlap with ESR1 ChlP-seq enriched 
regions. 
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two sharp peaks flanking the AR binding 
site upon AR activation (Fig. 2A, left 
panels). In contrast, ESR1 binding sites in 
MCF7 cells that lack FOXA1 had a broad 
peak of H3K4me2 centered over the ESR1 
binding site both before and after ESR1 
activation (Fig. 2B, left panels). The pattern 
of H3K4me2 at the shared AR/FOXA1 
and ESR1/FOXA1 sites was also distinct. 
H3K4me2 signal at the AR/FOXA1 bound 
sites indicates nucleosome depletion and 
better positioned flanking nucleosomes 
after AR activation (Fig. 2A, center panels). 
In contrast, the pattern at ESR1/FOXA1 
sites is similar to the ESR1 -unique sites 
and has a single broad peak both before 
and after ESR1 activation (Fig. 2B, center 
panels). NPS, an algorithm that predicts 
nucleosome position (Zhang et al. 2008b), 
also predicts clearly different nucleosome 
distributions relative to ESR1 -unique bind- 
ing sites, FOXA1 -unique binding sites, and 
shared sites (Supplemental Fig. 1C-E). At 
sites of ESR1 binding with or without 
FOXA1, the predicted nucleosomes more 
frequently overlap the ESR1 binding site 
(Supplemental Fig. 1E,F) while FOXA1 
sites that lack ESR1 binding sites have a 
peak of binding that is in a region removed 
from a nucleosome center (Supplemental 
Fig.lG). 

To further test whether the differ- 
ences between ESR1 and AR are intrinsic 
to the transcription factors, we examined 
the MCF-7-derived hormone-independent 
breast cancer cell line MCF-7:2A (Pink 
et al. 1995; Ariazi et al. 2011). While MCF- 
7:2A cells grow in the absence of estrogen 
or androgen, their growth is inhibited by 
silencing of either ESR1 or AR (data not 
shown). Sixty-five percent of the ESR1 
binding sites in MCF-7 under the E2- 
stimulated condition overlap with those 
of MCF-7:2A in the absence of estrogen 
(Supplemental Fig. 2C). While there is 
significant overlap in the ESR1 and AR 
binding sites in MCF-7:2A, there are also 
many ESR1- and AR-unique sites (Fig. 2C, 
Venn diagram). MNase digested H3K4me2 
ChlP-seq in MCF-7:2A was performed, and 
the distribution of H3K4me2 at ESR1- 
unique, AR-unique, and shared sites was 
determined (Fig. 2C; Supplemental Fig. 
2D). At the ESR1 -unique binding sites, 
H3K4me2 formed a sharp, unimodal peak 
at the binding site (Fig. 2C, left panel). In 
contrast, the AR-unique sites are associ- 
ated with a broader H3K4me2 tag distri- 
bution with two modes that flank the AR 
binding site (Fig. 2C, right panel). Shared 
ESR1 and AR binding sites had an 
H3K4me2 profile with an intermediate 
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Figure 2. (Legend on next page) 
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distribution between that of the ESR1 -unique and AR-unique sites 
(Fig. 2C, middle panel). These results suggest that, although AR 
binding involves depletion of a nucleosome directly over the AR 
binding site, ESR1 binding does not. 

Quantitative measures of DNase I dynamics are predictive 
of TF binding 

Given our finding that ESR1 binding could not be predicted by 
changes in the occupancy of H3K4me2 marked nucleosomes, we 
investigated stimulus-dependent changes in DNase I hypersen- 
sitivity (DHS) as a complement to nucleosome occupancy. An 
analysis of DHS under unstimulated (Veh) and androgen-stimu- 
lated (DHT) conditions in the LNCaP cell line demonstrated that 
51% of AR binding sites overlap with androgen-stimulated DHS 
regions (Fig. 3A), as would be expected from our prior work on 
nucleosome occupancy. When we analyzed DHS in MCF-7 cells in 
unstimulated and estrogen-stimulated (E2) conditions, we found 
that —63% of ESR1 binding sites overlap with stimulated DHS re- 
gions (Fig. 3B). In LNCaP cells, increasing the sequencing depth 
from 50 M to 70 M increased the proportion of AR sites that over- 
lapped a DHS site from 51% to 55% (Supplemental Fig. 3 A). Simi- 
larly, increasing the sequencing depth in MCF-7 cells from 28 M to 
70 M raised the proportion of ESR1 sites that overlap with DHS from 
63% to 71% (Supplemental Fig. 3B). ESR1 and AR sites that are not 
associated with DHS show significantly lower levels of binding than 
those that are associated with DHS (Supplemental Fig. 3C,D). 

DHS regions encompass genomic locations that are associated 
with a variety of transcription factors and other chromatin-asso- 
ciated complexes; therefore, we investigated whether changes in 
DHS between conditions can be used to enhance the specificity of 
transcription factor binding site prediction. Starting with the set of 
DHS regions that were detected under hormone-stimulated con- 
ditions, we ranked the regions by three criteria: the DHS tag count 
in the unstimulated condition, the DHS tag count in the stimu- 
lated condition, and a score representing the change in the num- 
ber of tag counts between the two conditions (ADHS) (Fig. 3C,D). 
The results for the LNCaP AR and MCF-7 ESR1 systems were quite 
distinct. In LNCaP cells, the level of DHS is not a strong predictor of 
AR binding in either the unstimulated or stimulated condition, 
although in both cases it is somewhat informative. In contrast, the 
change in DHS, ADHS, is a very strong predictor of AR binding (Fig. 
3C). Interestingly, in the MCF-7 system, the level of DHS under 
unstimulated conditions is slightly predictive of ESR1 binding; 
however, estrogen-stimulated DHS and, most significantly, ADHS 
are progressively superior at predicting ESR1 binding (Fig. 3D). 
These results suggest on a genome-wide scale that at AR and ESR1 
binding sites DHS increases upon receptor binding. 

On a genomic scale, DNA sequence recognition motifs alone 
are poor predictors of in vivo ESR1 and AR binding. However, 
within DHS regions, DNA sequence motifs may be useful for iden- 



Figure 2. Mono-nucleosome level H3K4me2 ChlP-seq at nuclear receptor and FOXA1 binding loci in 
the MCF-7 (A), LNCaP (B), and MCF-7:2A (C) cell lines. (A) (Top panel) Venn diagram of AR binding 
in relation to FOXA1 binding. (Middle panel) Distribution of H3K4me2 signal centered on AR-unique, 
AR/FOXA1 shared, and FOXA1 -unique sites in the unstimulated condition. (Bottom panel) Distribution 
of H3K4me2 signal centered on the AR-unique, AR/FOXA1 shared, and FOXA1 -unique sites under 
conditions of androgen stimulation. (B) (Top panel) Venn diagram of ESR1 binding in relation to 
FOXA1 binding. (Middle panel) Distribution of H3K4me2 signal centered on ESR1 -unique, ESR1/FOXA1 
shared and FOXA1 -unique sites in unstimulated cells. (Bottom panel) Distribution of H3K4me2 signal 
centered on ESR1 -unique, ESR1/FOXA1 shared, and FOXA1 -unique sites in estrogen stimulated cells. 
(C) (Top panel) Venn diagram of ESR1 binding in relation to AR binding. (Bottom panel) Distribution of 
H3K4me2 signal centered on ESR1 -unique, ESR1 /AR shared, and AR-unique sites in unstimulated cells. 



tifying the DHS sites associated with the binding of a particular 
transcription factor. Starting with the set of DHS regions detected in 
the hormone-stimulated condition, we ranked the regions by three 
criteria: ADHS, strength of the AR or ESR1 DNA sequence motif, and 
a combination of the sequence motif and ADHS. In both the LNCaP 
and MCF7 systems, the nuclear receptor binding motifs are capable 
of discerning the binding locations of the specific factors from the 
remainder of the open chromatin regions (Supplemental Fig. 4A,B). 
Therefore, while DNA sequence motifs may not be reliable pre- 
dictors of transcription factor binding across the entire genome 
(Carroll et al. 2006), they are reliable predictors within the regions of 
open chromatin. The best prediction of AR or ESR1 binding, how- 
ever, was obtained by combining ADHS and motif based rankings. 
To further assess the ability of our approach to predict genome-wide 
receptor binding sites, we carried out precision-recall analysis for ESR1 
(Fig. 3E) and AR (Supplemental Fig. 5). Precision is the fraction of 
predicted binding sites that are true positives and recall is the fraction 
of true binding sites identified. As seen for ESR1 in MCF-7 cells, DNA 
sequence motif alone is a poor predictor of binding. Combining static 
DHS peaks with motif yields a significantly better prediction, while 
combining ADHS with motif is most predictive. Interestingly when 
we plotted the precision-recall value for the ESR1 binding sites pre- 
dicted by the CENTIPEDE algorithm (Pique-Regi et al. 201 1) we found 
a point-prediction (see Methods) that is very similar to what we find 
using static DHS plus motif. Thus ADHS plus motif provides a pow- 
erful computational model for TF binding site prediction. 

DNase I hypersensitivity is dependent on combinations 
of bound transcription factors 

We further investigated the influence of combinations of ESR1 and 
AR binding with FOXA1 on ADHS. We found that the majority of 
FOXA1 sites are DHS in the LNCaP (72%) and MCF-7 (64%) cell 
lines (Supplemental Fig. 6). Interestingly, while DHS tends to in- 
crease at shared nuclear receptor FOXA1 sites, FOXA1 sites that do 
not overlap with AR or ESR1 loci after stimulation are associated 
with a decrease in DHS (Fig. 3F,G). In addition, ADHS at nuclear 
receptor binding loci are modified by the presence of FOXA1 in 
a cell line dependent fashion. In MCF-7, ESR1 sites that overlap 
with FOXA1 loci tend to show larger increases in DHS than the 
non-FOXAl binding site containing ESR1 sites (Fig. 3G). In con- 
trast, we observe a larger ADHS in non-FOXAl AR binding sites 
than in the AR sites that overlap with FOXA1 (Fig. 3F) in LNCaP 
cells, despite the fact that the hormone-stimulated DHS signals in 
both cell lines are greatest at the shared nuclear receptor-FOXAl 
shared sites (Supplemental Fig. 7). 

Coactivator activity is detected by ADHS 

One motivation for generating cistromes is to gain insight into the 
regulation of gene expression. To determine if ADHS can inform 
transcriptional regulation, we compared published LNCaP gene 
expression data (Wang et al. 2009) and 

MCF-7 GRO-seq data (Hah et al. 2011) 

with three sets of DHS sites: hormone- 
increased (top 5000 ADHS), hormone- 
diminished (bottom 5000 ADHS), and 
hormone-unchanged (middle 5000 ADHS). 
In both LNCaP and MCF-7 cells, the ratio of 
up-regulated genes to non-regulated genes 
(odds ratio) has a strong positive associa- 
tion with the hormone-increased DHS sites 
within 20 kb of the transcription start site 
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(TSS) (Fig. 4A, red bars). In contrast, there is no positive association 
between hormone-unchanged or -diminished DHS sites with in- 
creased gene expression (Fig. 4A, blue and green bars). 



We have previously shown using ESR1 ChlP-chip and gene ex- 
pression microarrays in MCF-7 that early up-regulated genes, which 
increased after 3 h of hormone stimulation, are strongly associated 
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with ESR1 binding, whereas the early 
down-regulated genes are not (Carroll 
et al. 2006). These findings were con- 
firmed by Hah and colleagues using GRO- 
seq (Hah et al. 2011). Interestingly we 
find a strong association of early down- 
regulated genes with the hormone- 
diminished DHS sites (Fig. 4B, green 
bars). Motif analysis shows that, while 
the hormone-induced DHS regions are 
enriched for motifs for ESR1, forkhead and 
AP-1, the hormone-diminished DHS sites 
are enriched primarily for the forkhead 
motif and not the ESR1 motif (Table 1). 
We confirmed that FOXA1 binding is 
enriched at the sites with both the highest 
and lowest ADHS using FOXA1 ChlP-seq 
data (Fig. 5A). Interestingly, the FOXA1 
sites lacking ESR1 are only strongly as- 
sociated with the sites with the lowest 
ADHS (Fig. 5B). One explanation for these 
findings would be that, at sites where 
FOXA1 is bound in the absence of ESR1, 
FOXA1 binding is reduced upon estrogen 
stimulation. 

To investigate whether FOXA1 sites 
without ESR1 binding have reduced en- 
richment upon stimulation, we compared 
the FOXA1 ChlP-seq reads under vehicle 
and stimulated conditions (Joseph et al. 
2010) within the three categories of 5000 
DHS sites (Fig. 5C). Starting with DHS re- 
gions detected in the E2-stimulated con- 
dition we counted the number of FOXA1 
tags obtained from ChlP-seq in unstimu- 
lated and E2-stimulated conditions. If we restrict the set of DHS 
regions to include only the middle 5000 hormone-unchanged re- 
gions and plot the FOXA1 tag count for the stimulated condition as 
a function of that for the unstimulated condition, we see a linear 
trend, represented by the blue regression line in Figure 5C. In 
a similar way if we select the top 5000 hormone-increased DHS sites, 
we again see a linear trend but the slope of the regression line (red) 
for this trend is greater. This indicates that there is more hormone- 
stimulated FOXA1 binding in the hormone-increased set than in 
the hormone-unchanged set. If we select the top 5000 hormone- 
diminished sites and plot a regression line (green), we see the slope 
of the regression line through the hormone-diminished set is not 
significantly lower than that of the hormone-unchanged set. A re- 
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Figure 4. Association between dynamic DNase-seq and differentially expressed genes. Three groups 
of DHS are represented in LNCaP and MCF-7 cells: hormone-induced DHS sites (red); hormone-un- 
changed DHS sites (blue); and hormone-diminished DHS sites (green). (K-axis) Odds ratio calculated by 
the following formula: (up-regulated genes with at least one nearby site/non-regulated genes with at 
least one nearby site)/(up-regulated genes with no nearby site/non-regulated genes with no nearby 
site). In this definition, "nearby" means within 20 kb of the TSS. The hormone-induced sites are asso- 
ciated with up-regulated genes (A), while the hormone-depleted sites are associated with down-reg- 
ulated genes (8). 



duction in FOXA1 binding does not, therefore, appear to explain 
the decrease in DHS. 

Physiological squelching (Meyer et al. 1989) has been postulated 
to be an important mode of early estrogen down-regulation (Carroll 
et al. 2006). This phenomenon occurs when multiple factors in the 
same cell share a common factor, such as a coactivator that is present 
at a limiting concentration. The transcription factors interfere with 
each other, "squelching" each other's influence. Of the numerous 
known ESR1 coactivators, NCOA3 has been shown to have a partic- 
ularly strong synergy with ESR1 in enhancing gene expression 
(Torchia et al. 1997). As with FOXA1, NCOA3 binding was associated 
with both the highest and lowest ADHS sites overall and with only 
the lowest ADHS sites at loci lacking ESR1 (Supplemental Fig. 8). 



Figure 3. Characteristics of DNase I hypersensitivity sequencing. (A) Venn diagram of the DHS and AR peaks in LNCaP. The DNase-seq sequencing depth 
was normalized to the lower sequencing depth for the unstimulated (50 M) and androgen-stimulated (70 M) conditions. (B) Venn diagram of the DHS and 
ESR1 peaks in MCF-7. DNase-seq sequencing depth was normalized to the lower sequencing depth of the unstimulated (28 M) and estrogen-stimulated (70 
M) conditions. (C,D) The fraction of LNCaP AR(C) or MCF-7 ESR1 (D) binding sites in bins ranked by three measures: DNase-seq tag counts in stimulated and 
unstimulated conditions and a score, ADHS, representing the change in DNase I hypersensitivity between the two conditions. The DNase-seq peak regions 
under the stimulated condition are ranked by these measures. To calculate the proportion of real binding sites as a function of rank, these ranked regions are 
grouped into bins of 500. (K-axis) Fraction of regions in each bin that overlap with AR (C) or ESR1 (D) ChlP-seq enriched regions. (£) The precision-recall curves 
for prediction power of MCF-7 ESR1 binding sites were calculated by five measures: ADHS, ESR1 motif, ESR1 motif in E2 DHS, sqrt([ADHS rank]*[motif rank]), 
and results generated by the CENTIPEDE algorithm on ENCODE MCF-7 DNase-seq data (see Methods). (F,C) Box plots showing the distribution of the 
DNase-seq change (ADHS) between the unstimulated and stimulated conditions in LNCaP (F) and MCF-7 (C) cells. "AN" represents all the DHS sites in MCF-7 
and LNCaP; "AR not FOXA1 " and "ESR1 not FOXA1 " represent AR and ESR1 binding sites that do not overlap with FOXA1 ; "AR and FOXA1 " and "ESR1 and 
FOXA1 " represent AR and ESR1 binding sites that overlap with FOXA1 ; "FOXA1 not AR" and "FOXA1 not ESR1 " represent FOXA1 binding sites that do not 
overlap with AR and ESR1 . (**) Wilcoxon rank-sum test P-values <0.01, comparing "all" with the other categories. 
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Table 1. Top 20 motifs enriched in the top 5000 and bottom 5000 MCF-7 ADHS regions 



Top 5000 ADHS Bottom 5000 ADHS 



Motif in 

IVIOIIT 1 U 


U 1 

Gene symbol 


U f UN- 

Number ot hits 


P-value 


Motif in 

IVIOIIT 1 U 


U 1 

Gene symbol 


Number ot hits 




P- value 


M00959 


ESR 7 


2002 


1.00 


X 


10 30 


M00724 


FOXA1 


3806 


1 


.00 


x 10 30 


M00515 


PPARG 


997 


1.00 


X 


10~ 30 


M00131 


FOXA2 


3782 


1 


.00 


x 10 30 


M00925 


JUN/FOS 


1557 


1.00 


X 


IO" 30 


M01012 


FOXM1 


4283 


1 


.00 


x 10 30 


M00156 


RORA 


2797 


1.00 


X 


IO" 30 


M00269 


FOXA2 


4803 


1 


.00 


x 1 0 30 


M00037 


NFE2 


1696 


1.00 


X 


io 30 


M00292 


FOXD1 


3250 


1 


.00 


x IO" 30 


M00285 


NFE2L1 


4634 


1.00 


X 


IO" 30 


M00422 


FOXJ2 


4171 


1 


.00 


x 10 30 


M00495 


BACH1 


972 


1 .00 


x 


1Q -30 


M00290 


FOXF2 


4927 


1 


.00 


x 10 30 


M00490 


BACH2 


875 


1.00 


X 


10~ 30 


M00291 


FOXC1 


4891 


1 


.00 


x 10 30 


M00239 


NRW1 


1531 


1.00 


X 


io- 30 


M00289 


FOXI1 


4747 


1 


.00 


x 10 30 


M00727 


SF1 


4133 


1.00 


X 


io- 30 


M00266 


CROCC 


4693 


1 


.00 


x 10 30 


M00511 


SLC7A1 


3291 


1.00 


X 


10" 30 


M00268 


XFD2 


4962 


1 


.00 


x 10 30 


M01138 


ROR1 


3723 


1.00 


X 


10" 30 


M01137 


FOX03 


4496 


1 


.00 


x 10 30 


M00292 


FOXD1 


4559 


1.00 


X 


IO" 30 


M00809 


FOX factors 


3571 


1 


.00 


x IO 30 


M00157 


ROR2 


1568 


1.00 


X 


IO 30 


M00472 


FOX04 


4543 


1 


.00 


x 10 30 


M00204 


CCN4 


808 


1.00 


X 


IO" 30 


M00742 


FOXJ1 


4768 


1 


.00 


x 10 30 


M00821 


NFE2L2 


2809 


1.00 


X 


IO" 30 


M00267 


XFD1 


4583 


1 


.00 


x 10 30 


M00035 


MAF 


2896 


1.00 


X 


10 30 


M00951 


GRHL3 


3561 


1 


.00 


x IO 30 


M00269 


FOXA2 


2247 


2.06 


X 


IO" 27 


M00294 


FOXF1 


4859 


1 


.00 


x 10 30 


M00724 


FOXA1 


4452 


1.34 


X 


IO" 25 


M00475 


FOX03 


4259 


1 


.00 


x 10 30 


M00983 


MAF 


949 


2.89 


X 


IO 25 


M00473 


FOXOI 


4800 


1 


.00 


x 10~ 30 



Using published MCF-7 NCOA3 ChlP-seq data (Joseph et al. 
2010; Lanz et al. 2010), we compared NCOA3 and FOXA1 cis- 
tromes, finding 61% of FOXA1 binding sites overlap with NCOA3 
(Supplemental Fig. 9). Analyzing the three categories of DHS sites 
using this NCOA3 ChlP-seq data, we found that NCOA3 binding 
associated with hormone-diminished DHS loci was distributed in 
a clearly distinct pattern from the hormone-unchanged sites (Fig. 
5D). The slope of the regression line of the hormone-diminished 
set was significantly lower than that of the hormone-unchanged 
set (Fig. 5D). As ESR1 directly interacts with NCOA3, these data 
support the hypothesis that ESR1 competes with FOXA1 for lim- 
ited amounts of NCOA3 that are either directly associated with 
FOXA1 or associated with other transcription factors whose 
binding is facilitated by FOXA1. 

If physiological squelching is responsible for the E2-stimu- 
lated loss of NCOA3 at FOXA1 binding sites, then higher con- 
centrations of NCOA3 in the nucleus should result in a reduced E2- 
stimulated NCOA3 loss. We tested this by overexpressing NCOA3 
(Supplemental Fig. 10A), selecting six FOXA1 non-ESRl binding 
sites from hormone-diminished DHS and determining NCOA3 
and FOXA1 binding strength by ChlP-qPCR (Supplemental Fig. 10). 
The control confirms what we found in the ChlP-seq data: FOXA1 
binding does not significantly change on E2 stimulation and there 
is NCOA3 loss (Fig. 5E). In the NCOA3 overexpression experiment, 
however, we find no significant change in either FOXA1 or NCOA3 
binding on E2 stimulation (Fig. 5F). We also examined the effect 
of NCOA3 overexpression on the expression of five genes down- 
regulated by estrogen and found that NCOA3 overexpression re- 
duced the extent of these expression changes (Supplemental Fig. 
11). These results are consistent with the physiological squelching 
mechanism in which E2-induced ESR1 binding sites compete with 
FOXA1 sites for the NCOA3 coregulator. 

Discussion 

Using genome-wide DNase-seq and H3K4me2 ChlP-seq analyses, 
we have mapped important features of enhancer-associated chro- 
matin. We observed systematic differences in nucleosome occupancy 



patterns and DHS associated with different transcription factors in 
LNCaP and MCF-7 cell lines. While AR binding in LNCaP cells has 
large effects on nucleosome occupancy, ESR1 binding in MCF-7 
cells is not strongly influenced by, nor does it influence, nucleo- 
some occupancy. In LNCaP cells, it has been reported that a 
knockdown of FOXA1 expression causes a dramatic change in AR 
binding locations, including the gain of numerous sites that are 
not observed under normal FOXA1 conditions (Wang et al. 2011). 
Notably, these new AR binding sites were not associated with ob- 
servable nucleosome remodeling but were more like the ESR1 
binding we observed in MCF-7 cells. 

Thermodynamic equilibrium has been proposed to explain 
experimentally observed genome-wide in vivo nucleosome occu- 
pancy patterns. In this model, both nucleosomes and transcription 
factors have an intrinsic affinity for DNA sequence that is de- 
pendent on sequence composition (Segal and Widom 2009). 
Transcription factors compete with nucleosomes for DNA, and 
thermodynamic equilibrium determines the configuration of nu- 
cleosomes and transcription factors. In addition, nucleosome oc- 
cupancy is likely to be shaped by kinetic elements, in particular, 
chromatin-remodeling factors using the energy derived from ATP 
hydrolysis to actively modify DNA-histone interactions. The im- 
portance of ATP-dependent factors was demonstrated in a recent 
study that showed that ATP is essential for creating the strongly 
positioned nucleosome arrays observed near TSSs in Saccharomyces 
(Zhang et al. 2011). Experimental evidence shows that different 
chromatin remodeling enzymes are recruited to enhancer loci by 
sequence-specific transcription factors (Peterson and Workman 
2000), such as nuclear receptors. For example, BRG-1, the active 
component of human SWI/SNF chromatin-remodeling com- 
plexes, has been shown to be a key factor that potentiates AR- and 
ESR1 -regulated transcription (DiRenzo et al. 2000; Dai et al. 2008). 
Both AR and ESR1 are known to interact directly with BAF57, 
a component of the SWI/SNF remodeling complexes (Belandia 
et al. 2002; Link et al. 2005). Several modes of chromatin remod- 
eling have been suggested, including nucleosome sliding, nucle- 
osome eviction, and looping of DNA away from the histone core. 
We speculate that the distinct mechanisms of the different classes 
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Figure 5. DNase I hypersensitivity changes at FOXA1 and NCOA3 sites. Association of ADHS with 
FOXA1 sites in the presence (A) and absence (B) of ESR1 binding. MCF-7 DHS in the estrogen-stimulated 
condition were ranked in descending order based on the ADHS score. These ranked regions are grouped 
into bins of 500. (K-axis) Fraction of regions that overlap with FOXA1 ChlP-seq enriched regions. Scatter 
plots of FOXA1 (C) and NCOA3 (D) ChlP-seq tag counts in the stimulated condition compared with 
counts in the unstimulated condition. Three groups of 5000 DHS sites were selected from the MCF-7 
estrogen-stimulated DHS sites: DHS-increased (red), DHS-unchanged (blue), and DHS-diminished 
(green). Regression lines were drawn for each of the groups. The steeper the slope of a regression line, 
the greater the binding of the factor in the E2-stimulated condition relative to the unstimulated con- 
dition. While the slope for FOXA1 in the DHS-diminished category is not significantly different from that 
in the DHS-unchanged category, the slope for NCOA3 in the DHS-diminished category is less than that 
for the DHS-unchanged category. This means that within the DHS-diminished category NCOA3 binding 
tends to decrease on E2 stimulation while FOXA1 binding is maintained at the same level. Changes of 
FOXA1 and NCOA3 binding strength at FOXA1 binding sites in the overexpression control (f ) and NCOA3 
overexpression (F) samples under stimulated and unstimulated conditions. Six FOXA1 binding sites were 
selected from the hormone-diminished DHS sites. Box plots were generated from the ChlP-qPCR data of the 
six sites tested. The individual ChlP-qPCR assays are shown in Supplemental Figure 1 0. 



of ATP-dependent remodeling enzymes may explain the differential 
chromatin effects seen in our experiments. Our study demonstrates 
how MNase digestion and DHS chromatin assays provide comple- 
mentary information on chromatin structure. 



Our differential DNase I hypersensi- 
tivity experiments revealed a surprising 
link between coregulator and chromatin 
structure. Significantly, this link was not 
merely a consequence of FOXA1 binding 
itself. NCOA3 ChlP-seq in MCF-7 cells 
under vehicle and estrogen-induced con- 
ditions revealed that, although a high 
overlap between NCOA3 and ESR1 was 
observed, an unexpectedly high overlap 
between FOXA1 binding sites and NCOA3- 
enriched loci was also found (Lanz et al. 
2010). Previously, coregulators and chro- 
matin remodeling activity had been shown 
to act synergistically in the AR and ESR1 
systems in collaboration with the AR and 
ESR1 factors themselves (Metivier et al. 
2003; Wang et al. 2005). Here, we find 
evidence for a chromatin remodeling- 
coregulator synergy that is associated 
with FOXA1 in the absence of ESR1 or AR. 
Our experiment supports the hypothesis 
that physiological squelching is an im- 
portant mechanism involved in the 
down-regulation of genes at early time 
points following estrogen treatment. 

According to our current under- 
standing, DNase I hypersensitivity occurs 
in nucleosome free regions that are close 
to transcription factor binding sites. Al- 
though we do observe many DHS in non- 
nucleosomal DNA, DHS sites sometimes 
occur in regions having high nucleosome 
occupancy. In particular, we identified 
a set of DHS sites that were associated with 
ESR1 binding to nucleosomal DNA. The 
different nucleosome occupancy and 
DNase I hypersensitivity patterns that we 
observed are likely dependent on not 
only the details of the transcription fac- 
tor-DNA interaction but also on the 
chromatin environment at the binding 
site. Relevant aspects of the chromatin 
environment may include post-trans- 
lational histone modifications, the com- 
position of the nucleosomes themselves, 
and the presence of other protein com- 
plexes. Histone post-translational modi- 
fications may influence transcription 
factor binding by enhancing the affinity 
of transcription factor related protein 
complexes for the modified histone or 
by reducing the affinity of the histone 
octamer for DNA. The structure of the 
nucleosome cores may also determine 
nucleosomes as being more or less per- 
missive to transcription factor binding as 
histones that constitute nucleosomes 
come in variants, such as H2A.Z, that have been reported to alter 
nucleosome properties (Jin et al. 2009). 

In our analysis of genome- wide dynamic DNase-seq, we noted 
three important factors that contribute to DNase I hypersensitivity. 
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First, in agreement with the standard view, the majority of DHS sites 
occur in nucleosome-free regions. Second, DHS frequently arises as 
a result of transcription factor binding; however, they do not nec- 
essarily occur in nucleosome-free regions. Third, DHS can change 
with the addition or removal of cofactors. We demonstrated that 
dynamic DNase-seq is an effective and informative approach that 
can be used to locate enhancers that regulate a cell's transcriptional 
response to stimuli. 

Methods 

Cell line and culture conditions 

The prostate cancer cell line LNCaP was obtained from the Amer- 
ican Type Culture Collection. LNCaP cells were maintained in 
RPMI 1640 medium supplemented with 10% fetal bovine serum 
(FBS), 2 mM glutamine, 100 U/mL penicillin, and 100 mg/mL 
streptomycin. The hormone-independent breast cancer cell line 
MCF-7:2A and the parental MCF-7 cell line were from V. Craig 
Jordan's lab. MCF-7 cells were maintained in RPMI 1640 medium 
supplemented with 10% fetal bovine serum (FBS), 2 mM gluta- 
mine, 100 U/mL penicillin, 100 mg/mL streptomycin, lx NEAA, 
and 6 |xg/L insulin. MCF-7:2A cells were maintained in phenol- 
red-free RPMI 1640 medium supplemented with 10% charcoal 
stripped FBS. LNCaP and MCF-7 cells were starved in phenol-red- 
free medium supplemented with 10% charcoal stripped FBS for 
3 d before hormone stimulation. 

ChIP and ChlP-seq 

The ChIP experiments were performed as previously described (He 
et al. 2010). We used antibodies to ESR1 (Ab-10 from Neomarkers; 
HC-20 from Santa Cruz), AR (N-20 from Santa Cruz), FOXA1 
(ab23738 from Abeam), and H3K4me2 (07-030 from Millipore). 
Library construction was performed using the Illumina ChlP-seq 
DNA sample Prep Kit according to the manufacture's instruction; 
the libraries were sequenced at a length of 35 bp with the Illumina 
Genome Analyzer. Model-based Analysis of ChlP-seq (MACS) 
software (Zhang et al. 2008a) was used to detect ChlP-seq peak 
regions. Nucleosome Positioning from Sequencing (NPS) software 
(Zhang et al. 2008b) was used to identify nucleosome positions 
based on the H3K4me2 ChlP-seq data. Binding Inference from 
Nucleosome Occupancy Changes (BINOCh) software (Meyer et al. 
201 1) was used to predict transcription factor binding events from 
the H3K4me2 NPS data. 

DNase hypersensitivity mapping 

DNase hypersensitivity mapping was performed as previously de- 
scribed with brief modifications (Ling et al. 2010; John et al. 201 1). 
LNCaP cells were starved for 3 d in phenol-red-free medium sup- 
plemented with 10% charcoal stripped FBS and then treated with 
ethanol or active androgen 5a-dihydrotestosterone (DHT) at a fi- 
nal concentration of 10 nM for 4 h. MCF-7 cells were starved the 
same way and then treated with ethanol or 17£-estrodial (E2) at 
a final concentration of 10 nM for 45 min. The cells were trypsi- 
nized and pelleted prior to washing and resuspension in buffer A 
(15 mM Tris-Cl [pH 8.0], 15 mM NaCl, 60 mM KC1, 1 mM EDTA 
[pH 8.0], 0.5 mM EGTA [pH 8.0], 0.5 mM spermidine, and 0.15 mM 
spermine) to a final concentration of 2 X 10 M cells/mL. Nuclei 
were extracted by adding buffer A containing NP-40. The nuclei 
were washed with buffer A and resuspended in prewarmed lysis 
buffer (13.5 mM Tris-HCl [pH 8.0], 87 mM NaCl, 54 mM KC1, 6 mM 
CaC12, 0.9 mM EDTA, 0.45 mM EGTA) at a concentration of 5 M/mL 
and then digested with different amounts of DNase I (Roche, 0-75 



U) for 5 min at 37°C. The reactions were terminated by the addition 
of an equal volume of stop buffer (1 M Tris-Cl [pH 8.0], 5 M NaCl, 
20% SDS, 0.5 M EDTA [pH 8.0], and 10 [xg/mL of RNase A [Roche]) 
and incubated at 55°C. After 15 min, Proteinase K (final concen- 
tration of 20 |xg/mL) was added to each digestion reaction and 
incubated for 2 h at 55°C. DNA was extracted by careful phenol- 
chloroform purification. The isolated DNA was run out on a gel, and 
DNA fragments between 100 and 400 bp long were gel-selected. The 
libraries were prepared following the Illumina library preparation 
protocol. DNase-seq libraries were sequenced at the Beijing Geno- 
mic Institute and the Center for Cancer Computational Biology 
(CCCB) at the Dana-Farber Cancer Institute. 

NCOA3 overexpression experiments 

A total of 12 |xg of pcDNA3.1-NCOA3 construct or the control 
empty vector were transfected in MCF-7 cells in 10-cm culture 
dishes using lipofectamine 2000 (Invitrogen) according to the 
manufacturer's instructions. After 72 h of transfection, cells were 
treated with estrogen or ethanol control for 45 min and then 
processed for ChlP-qPCR. For RT-qPCR, 3 |xg of the pcDNA3.1- 
NCOA3 or the empty vector were transfected in MCF-7 cells in six- 
well plates. After 72 h of transfection, cells were treated with es- 
trogen or ethanol control for 3 h. RNA was isolated using RNeasy 
mini kit (Qiagen) following the manufacturer's instructions. PCR 
primers used in this work are listed in Supplemental Table 1. 

Model for identifying differential DNase I 
hypersensitivity locations 

DNase I hypersensitive regions were identified using MACS with 
the default parameters. A tag was considered to belong to a geno- 
mic interval if, when shifted 100 bp in a strand-directed direction, 
the entire tag fell within that interval. Each peak i from the set of m 
MACS peaks was then given a DHS change score (ADHS) by the 
formula: 

&DHSi = ^ n i eat I ^ n trea t y m - ^flf ntro1 j (^„contro^ / m 

In this formula, rii is the tag count in a 600-bp interval cen- 
tered on the z-th MACS peak. The superscripts treat and control refer 
to the hormone-stimulated and vehicle conditions, respectively. 
We use the square root transformation to stabilize the variance of 
the score, allowing regions with high counts to be compared with 
those having low counts. Peaks within 1 kb of any RefSeq TSS were 
excluded from all analyses so as not to confound transcription 
factor binding effects with transcriptional ones. All analyses in- 
volving motifs enriched in the peak regions were identified using 
the BINOCh motif analysis software. 

Precision recall analysis 

To evaluate the ability of our method to predict TF binding we 
defined a set of bound and unbound genomic locations. We de- 
fined the bound set as the summits of MACS peaks determined 
from ChlP-seq data and located >1 kb from the nearest RefSeq TSS. 
To define the unbound set, we downloaded a file of "mappable" ge- 
nomic locations, "wgEncodeCrgMapabilityAlign50mer.bw.gz" from 
http://hgdownload.cse.ucsc.edu/goldenPath/hgl8/encodeDCC/ 
wgEncodeMapability/ and selected a set of 850,000 non-bound, 
non-TSS sites by randomly sampling genomic locations that had 
a mappability index >0.9. These locations were filtered to not lie 
within 1 kb of any RefSeq TSS, TF ChlP-seq summit or other ran- 
dom location. The background was then scaled up to cover 2 Gb, 
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the size of the mappable genome. A DHS or ChlP-seq region was 
considered to be a true positive if its center was within 250 bp of 
a TF summit and a false positive if its center was within 250 bp of 
a background site. For motif analysis 200 bp from the center of the 
DHS or ChlP-seq region was scanned using the BINOCh software 
(Meyer et al. 2011). CENTIPEDE predictions (Pique-Regi et al. 
2011) for ESR1 binding in MCF-7 were downloaded from http:// 
centipede.uchicago.edu/SimpleMulti/. In the performance eval- 
uation CENTIPEDE predictions were treated the same way as 
our DHS regions. Since the result we retrieved from the website 
contains no scoring information for the sites predicted by 
CENTIPEDE, a single point was drawn for the performance 
evaluation. 

DHS boxplots 

Tag counting under DHS peaks was carried out as before. Peaks 
were considered to be overlapping if their summits were within 
600 bp of each other. Box plots were produced using R with default 
parameters. The outliers beyond the whiskers are not shown. The 
P-values were calculated using the Mann-Whitney test. 

Gene expression data 

Affymetrix U133 Plus 2.0 microarray data (GSE7868) (Wang et al. 
2007) in LNCaP cells and the processed GRO-seq gene expression 
data (GSE27463) (Hah et al. 2011) in MCF-7 cells were used in this 
study. The microarray data were analyzed using the RMA algorithm 
(Irizarry et al. 2003) using a custom CDF probe (vll) mapping to 
the RefSeq genes (Dai et al. 2005). The statistical significance was 
calculated using limma software (Smyth and Speed 2003). 

Data access 

MCF-7 H3K4me2 ChlP-seq, LNCaP, and MCF-7 DNase-seq raw 
sequence tags, and processed bed files have been submitted to the 
NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih. 
gov/geo) under accession number GSE33216. 
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